!pip install lucifer-ml
Requirement already satisfied: lucifer-ml in c:\users\admin\anaconda3\lib\site-packages (0.0.57)
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from collections import Counter
from sklearn.model_selection import train_test_split
import plotly.graph_objects as go
import plotly.express as px
from luciferml.supervised import classification as cls
Telecom = pd.read_csv('TelcomCustomer-Churn_12.csv')
print(Telecom.shape)
Telecom.head()
(7043, 21)
| | customerID | gender | SeniorCitizen | Partner | Dependents | tenure | PhoneService | MultipleLines | InternetService | OnlineSecurity | ... | DeviceProtection | TechSupport | StreamingTV | StreamingMovies | Contract | PaperlessBilling | PaymentMethod | MonthlyCharges | TotalCharges | Churn |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 7590-VHVEG | Female | 0 | Yes | No | 1 | No | No phone service | DSL | No | ... | No | No | No | No | Month-to-month | Yes | Electronic check | 29.85 | 29.85 | No |
| 1 | 5575-GNVDE | Male | 0 | No | No | 34 | Yes | No | DSL | Yes | ... | Yes | No | No | No | One year | No | Mailed check | 56.95 | 1889.5 | No |
| 2 | 3668-QPYBK | Male | 0 | No | No | 2 | Yes | No | DSL | Yes | ... | No | No | No | No | Month-to-month | Yes | Mailed check | 53.85 | 108.15 | Yes |
| 3 | 7795-CFOCW | Male | 0 | No | No | 45 | No | No phone service | DSL | Yes | ... | Yes | Yes | No | No | One year | No | Bank transfer (automatic) | 42.30 | 1840.75 | No |
| 4 | 9237-HQITU | Female | 0 | No | No | 2 | Yes | No | Fiber optic | No | ... | No | No | No | No | Month-to-month | Yes | Electronic check | 70.70 | 151.65 | Yes |
5 rows × 21 columns
Telecom.dtypes
customerID           object
gender               object
SeniorCitizen         int64
Partner              object
Dependents           object
tenure                int64
PhoneService         object
MultipleLines        object
InternetService      object
OnlineSecurity       object
OnlineBackup         object
DeviceProtection     object
TechSupport          object
StreamingTV          object
StreamingMovies      object
Contract             object
PaperlessBilling     object
PaymentMethod        object
MonthlyCharges      float64
TotalCharges         object
Churn                object
dtype: object
print(Telecom.isnull().sum())
customerID          0
gender              0
SeniorCitizen       0
Partner             0
Dependents          0
tenure              0
PhoneService        0
MultipleLines       0
InternetService     0
OnlineSecurity      0
OnlineBackup        0
DeviceProtection    0
TechSupport         0
StreamingTV         0
StreamingMovies     0
Contract            0
PaperlessBilling    0
PaymentMethod       0
MonthlyCharges      0
TotalCharges        0
Churn               0
dtype: int64
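The dtype listing shows `TotalCharges` as `object` even though it holds amounts, while `isnull()` reports no missing values. One possible explanation, which is an assumption and is not checked in this notebook, is that a few rows hold blank strings instead of numbers. The sketch below uses made-up rows (not taken from the churn file) to show how `pd.to_numeric` with `errors='coerce'` would expose such entries:

```python
import pandas as pd

# Hypothetical rows for illustration only; they are not from the churn file.
sample = pd.DataFrame({"TotalCharges": ["29.85", "1889.5", " ", "108.15"]})

# A blank string is a valid Python object, so isnull() does not flag it.
missing_before = int(sample["TotalCharges"].isnull().sum())

# Coercing to numeric turns entries that cannot be parsed into NaN.
total_numeric = pd.to_numeric(sample["TotalCharges"], errors="coerce")
missing_after = int(total_numeric.isnull().sum())

print("missing before conversion:", missing_before)
print("missing after conversion: ", missing_after)
print("dtype after conversion:   ", total_numeric.dtype)
```

If the same pattern shows up on `Telecom['TotalCharges']`, the affected rows could be dropped or imputed before modelling; which choice is appropriate depends on how many rows are involved.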
Telecom.describe().T.style.bar(
subset=['mean'],
color='Reds').background_gradient(
subset=['std'], cmap='ocean').background_gradient(subset=['50%'], cmap='PuBu')
| | count | mean | std | min | 25% | 50% | 75% | max |
|---|---|---|---|---|---|---|---|---|
| SeniorCitizen | 7043.000000 | 0.162147 | 0.368612 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 1.000000 |
| tenure | 7043.000000 | 32.371149 | 24.559481 | 0.000000 | 9.000000 | 29.000000 | 55.000000 | 72.000000 |
| MonthlyCharges | 7043.000000 | 64.761692 | 30.090047 | 18.250000 | 35.500000 | 70.350000 | 89.850000 | 118.750000 |
Telecom.describe()
| | SeniorCitizen | tenure | MonthlyCharges |
|---|---|---|---|
| count | 7043.000000 | 7043.000000 | 7043.000000 |
| mean | 0.162147 | 32.371149 | 64.761692 |
| std | 0.368612 | 24.559481 | 30.090047 |
| min | 0.000000 | 0.000000 | 18.250000 |
| 25% | 0.000000 | 9.000000 | 35.500000 |
| 50% | 0.000000 | 29.000000 | 70.350000 |
| 75% | 0.000000 | 55.000000 | 89.850000 |
| max | 1.000000 | 72.000000 | 118.750000 |
Telecom.shape
(7043, 21)
churn_count = Telecom['Churn'].value_counts()
print(churn_count)
churn_count.plot.pie()
No     5174
Yes    1869
Name: Churn, dtype: int64
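The counts above imply an imbalanced target: roughly a quarter of customers churned. A small sketch that turns the printed counts into shares and derives the accuracy a trivial "always predict No" model would reach, which is a useful reference point for the accuracies reported later (the variable names here are illustrative):

```python
# Counts copied from the value_counts() output above.
churn_counts = {"No": 5174, "Yes": 1869}

total = sum(churn_counts.values())
churn_share = {label: count / total for label, count in churn_counts.items()}

for label, share in churn_share.items():
    print(f"{label}: {share:.1%}")

# Accuracy of a model that always predicts the majority class.
majority_baseline = max(churn_share.values())
print(f"Majority-class baseline accuracy: {majority_baseline:.1%}")
```

On the full DataFrame the same shares can be read directly from `Telecom['Churn'].value_counts(normalize=True)`.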
<AxesSubplot:ylabel='Churn'>
plt.figure(figsize=(6,5))
sns.countplot(x="Churn", data=Telecom, palette='viridis');
Telecom.columns
Index(['customerID', 'gender', 'SeniorCitizen', 'Partner', 'Dependents',
'tenure', 'PhoneService', 'MultipleLines', 'InternetService',
'OnlineSecurity', 'OnlineBackup', 'DeviceProtection', 'TechSupport',
'StreamingTV', 'StreamingMovies', 'Contract', 'PaperlessBilling',
'PaymentMethod', 'MonthlyCharges', 'TotalCharges', 'Churn'],
dtype='object')
def boxhistplot(column, data):
    # Draw a histogram and a box plot of one column, coloured by churn status
    fig = px.histogram(data, x=column, color='Churn')
    fig.show()
    fig2 = px.box(data, x=column, color='Churn')
    fig2.show()
col = ['customerID', 'gender', 'SeniorCitizen', 'Partner', 'Dependents',
       'tenure', 'PhoneService', 'MultipleLines', 'InternetService',
       'OnlineSecurity', 'OnlineBackup', 'DeviceProtection', 'TechSupport',
       'StreamingTV', 'StreamingMovies', 'Contract', 'PaperlessBilling',
       'PaymentMethod', 'MonthlyCharges', 'TotalCharges']
for column in col:
    boxhistplot(column, Telecom)
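# Restrict to numeric columns so corr() also works on pandas versions that reject string columns
sns.heatmap(Telecom.select_dtypes('number').corr(), annot=True, cmap="flag")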
<AxesSubplot:>
sns.pairplot(Telecom, hue="Churn", palette="magma")
<seaborn.axisgrid.PairGrid at 0x202294d30a0>
from luciferml.supervised.classification import Classification
dataset = pd.read_csv('TelcomCustomer-Churn_12.csv')
features = [ 'gender', 'SeniorCitizen', 'Partner', 'Dependents',
'tenure', 'PhoneService', 'MultipleLines', 'InternetService',
'OnlineSecurity', 'OnlineBackup', 'DeviceProtection', 'TechSupport',
'StreamingTV', 'StreamingMovies', 'Contract', 'PaperlessBilling', 'MonthlyCharges', 'TotalCharges']
X = Telecom[features]
y = Telecom['Churn']
accuracy_scores = {}  # maps classifier name -> accuracy reported by LuciferML
classifier = Classification(predictor = 'lr')
classifier.fit(X, y)
result = classifier.result()
accuracy_scores[result['Classifier']] = result['Accuracy']
██╗░░░░░██╗░░░██╗░█████╗░██╗███████╗███████╗██████╗░░░░░░░███╗░░░███╗██╗░░░░░
██║░░░░░██║░░░██║██╔══██╗██║██╔════╝██╔════╝██╔══██╗░░░░░░████╗░████║██║░░░░░
██║░░░░░██║░░░██║██║░░╚═╝██║█████╗░░█████╗░░██████╔╝█████╗██╔████╔██║██║░░░░░
██║░░░░░██║░░░██║██║░░██╗██║██╔══╝░░██╔══╝░░██╔══██╗╚════╝██║╚██╔╝██║██║░░░░░
███████╗╚██████╔╝╚█████╔╝██║██║░░░░░███████╗██║░░██║░░░░░░██║░╚═╝░██║███████╗
╚══════╝░╚═════╝░░╚════╝░╚═╝╚═╝░░░░░╚══════╝╚═╝░░╚═╝░░░░░░╚═╝░░░░░╚═╝╚══════╝
Started LuciferML
Checking if labels or features are categorical! [*]
Features are Categorical
Encoding Features [*]
Encoding Features Done [ ✓ ]
Labels are not categorical [ ✓ ]
Checking for Categorical Variables Done [ ✓ ]
Checking for Sparse Matrix [*]
Splitting Data into Train and Validation Sets [*]
Splitting Done [ ✓ ]
Scaling Training and Test Sets [*]
Scaling Done [ ✓ ]
Training Logistic Regression on Training Set [*]
Model Training Done [ ✓ ]
Predicting Data [*]
Data Prediction Done [ ✓ ]
Making Confusion Matrix [*]
[[935 101]
[159 214]]
Confusion Matrix Done [ ✓ ]
Evaluating Model Performance [*]
Validation Accuracy is : 0.815471965933286
Evaluating Model Performance [ ✓ ]
Applying K-Fold Cross Validation [*]
Accuracy: 79.84 %
Standard Deviation: 1.15 %
K-Fold Cross Validation [ ✓ ]
Complete [ ✓ ]
Time Elapsed : 0.556931734085083 seconds
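The log lists the stages LuciferML reports: encoding, train/validation split, scaling, training, confusion matrix, and K-fold cross-validation. The sketch below walks through the same stages with plain scikit-learn. It is not LuciferML's internal implementation; the synthetic data, the column subset, the 10 folds, and the 20 % validation split are assumptions (the 20 % figure is inferred from the 1409-row confusion matrix out of 7043 rows).

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in data (hypothetical); swap in Telecom[features] and Telecom['Churn'].
rng = np.random.default_rng(0)
n = 400
demo = pd.DataFrame({
    "Contract": rng.choice(["Month-to-month", "One year", "Two year"], size=n),
    "tenure": rng.integers(0, 73, size=n),
    "MonthlyCharges": rng.uniform(18, 119, size=n),
})
# Toy label: customers with short tenure are more likely to churn.
churn_prob = 1 / (1 + np.exp((demo["tenure"] - 20) / 10))
target = (rng.uniform(size=n) < churn_prob).astype(int)

# Encoding and scaling, applied per column type.
preprocess = ColumnTransformer([
    ("categorical", OneHotEncoder(handle_unknown="ignore"), ["Contract"]),
    ("numeric", StandardScaler(), ["tenure", "MonthlyCharges"]),
])
model = Pipeline([
    ("preprocess", preprocess),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Train/validation split, stratified so both sets keep the churn ratio.
X_train, X_val, y_train, y_val = train_test_split(
    demo, target, test_size=0.2, random_state=0, stratify=target
)

# Training, prediction, confusion matrix, and validation accuracy.
model.fit(X_train, y_train)
y_pred = model.predict(X_val)
cm = confusion_matrix(y_val, y_pred)
val_acc = accuracy_score(y_val, y_pred)

# K-fold cross-validation on the training set.
cv_scores = cross_val_score(model, X_train, y_train, cv=10)

print(cm)
print(f"Validation accuracy: {val_acc:.4f}")
print(f"K-fold accuracy: {cv_scores.mean():.2%} (std {cv_scores.std():.2%})")
```

Keeping the preprocessing inside the `Pipeline` means the encoder and scaler are refitted on each training fold, so no information from the held-out fold leaks into the scaling.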
classifier = Classification(predictor = 'svm')
classifier.fit(X, y)
result = classifier.result()
accuracy_scores[result['Classifier']] = result['Accuracy']
Started LuciferML
Checking if labels or features are categorical! [*]
Features are Categorical
Encoding Features [*]
Encoding Features Done [ ✓ ]
Labels are not categorical [ ✓ ]
Checking for Categorical Variables Done [ ✓ ]
Checking for Sparse Matrix [*]
Splitting Data into Train and Validation Sets [*]
Splitting Done [ ✓ ]
Scaling Training and Test Sets [*]
Scaling Done [ ✓ ]
Training Support Vector Machine on Training Set [*]
Model Training Done [ ✓ ]
Predicting Data [*]
Data Prediction Done [ ✓ ]
Making Confusion Matrix [*]
[[946 90]
[177 196]]
Confusion Matrix Done [ ✓ ]
Evaluating Model Performance [*]
Validation Accuracy is : 0.8105039034776437
Evaluating Model Performance [ ✓ ]
Applying K-Fold Cross Validation [*]
Accuracy: 79.43 %
Standard Deviation: 0.53 %
K-Fold Cross Validation [ ✓ ]
Complete [ ✓ ]
Time Elapsed : 10.523306369781494 seconds
classifier = Classification(predictor = 'knn')
classifier.fit(X, y)
result = classifier.result()
accuracy_scores[result['Classifier']] = result['Accuracy']
Started LuciferML
Checking if labels or features are categorical! [*]
Features are Categorical
Encoding Features [*]
Encoding Features Done [ ✓ ]
Labels are not categorical [ ✓ ]
Checking for Categorical Variables Done [ ✓ ]
Checking for Sparse Matrix [*]
Splitting Data into Train and Validation Sets [*]
Splitting Done [ ✓ ]
Scaling Training and Test Sets [*]
Scaling Done [ ✓ ]
Training K-Nearest Neighbours on Training Set [*]
Model Training Done [ ✓ ]
Predicting Data [*]
Data Prediction Done [ ✓ ]
Making Confusion Matrix [*]
[[889 147]
[187 186]]
Confusion Matrix Done [ ✓ ]
Evaluating Model Performance [*]
Validation Accuracy is : 0.7629524485450674
Evaluating Model Performance [ ✓ ]
Applying K-Fold Cross Validation [*]
Accuracy: 76.06 %
Standard Deviation: 1.01 %
K-Fold Cross Validation [ ✓ ]
Complete [ ✓ ]
Time Elapsed : 1.1291532516479492 seconds
classifier = Classification(predictor = 'dt')
classifier.fit(X, y)
result = classifier.result()
accuracy_scores[result['Classifier']] = result['Accuracy']
Started LuciferML
Checking if labels or features are categorical! [*]
Features are Categorical
Encoding Features [*]
Encoding Features Done [ ✓ ]
Labels are not categorical [ ✓ ]
Checking for Categorical Variables Done [ ✓ ]
Checking for Sparse Matrix [*]
Splitting Data into Train and Validation Sets [*]
Splitting Done [ ✓ ]
Scaling Training and Test Sets [*]
Scaling Done [ ✓ ]
Training Decision Tree Classifier on Training Set [*]
Model Training Done [ ✓ ]
Predicting Data [*]
Data Prediction Done [ ✓ ]
Making Confusion Matrix [*]
[[846 190]
[203 170]]
Confusion Matrix Done [ ✓ ]
Evaluating Model Performance [*]
Validation Accuracy is : 0.7210787792760823
Evaluating Model Performance [ ✓ ]
Applying K-Fold Cross Validation [*]
Accuracy: 73.20 %
Standard Deviation: 2.12 %
K-Fold Cross Validation [ ✓ ]
Complete [ ✓ ]
Time Elapsed : 0.5701887607574463 seconds
classifier = Classification(predictor = 'nb')
classifier.fit(X, y)
result = classifier.result()
accuracy_scores[result['Classifier']] = result['Accuracy']
Started LuciferML
Checking if labels or features are categorical! [*]
Features are Categorical
Encoding Features [*]
Encoding Features Done [ ✓ ]
Labels are not categorical [ ✓ ]
Checking for Categorical Variables Done [ ✓ ]
Checking for Sparse Matrix [*]
Splitting Data into Train and Validation Sets [*]
Splitting Done [ ✓ ]
Scaling Training and Test Sets [*]
Scaling Done [ ✓ ]
Training Naive Bayes Classifier on Training Set [*]
Model Training Done [ ✓ ]
Predicting Data [*]
Data Prediction Done [ ✓ ]
Making Confusion Matrix [*]
[[660 376]
[ 47 326]]
Confusion Matrix Done [ ✓ ]
Evaluating Model Performance [*]
Validation Accuracy is : 0.6997870830376153
Evaluating Model Performance [ ✓ ]
Applying K-Fold Cross Validation [*]
Accuracy: 68.73 %
Standard Deviation: 1.98 %
K-Fold Cross Validation [ ✓ ]
Complete [ ✓ ]
Time Elapsed : 0.2935006618499756 seconds
classifier = Classification(predictor = 'rfc')
classifier.fit(X, y)
result = classifier.result()
accuracy_scores[result['Classifier']] = result['Accuracy']
Started LuciferML
Checking if labels or features are categorical! [*]
Features are Categorical
Encoding Features [*]
Encoding Features Done [ ✓ ]
Labels are not categorical [ ✓ ]
Checking for Categorical Variables Done [ ✓ ]
Checking for Sparse Matrix [*]
Splitting Data into Train and Validation Sets [*]
Splitting Done [ ✓ ]
Scaling Training and Test Sets [*]
Scaling Done [ ✓ ]
Training Random Forest Classifier on Training Set [*]
Model Training Done [ ✓ ]
Predicting Data [*]
Data Prediction Done [ ✓ ]
Making Confusion Matrix [*]
[[935 101]
[194 179]]
Confusion Matrix Done [ ✓ ]
Evaluating Model Performance [*]
Validation Accuracy is : 0.7906316536550745
Evaluating Model Performance [ ✓ ]
Applying K-Fold Cross Validation [*]
Accuracy: 78.01 %
Standard Deviation: 1.28 %
K-Fold Cross Validation [ ✓ ]
Complete [ ✓ ]
Time Elapsed : 8.826218843460083 seconds
classifier = Classification(predictor = 'gbc')
classifier.fit(X, y)
result = classifier.result()
accuracy_scores[result['Classifier']] = result['Accuracy']
Started LuciferML
Checking if labels or features are categorical! [*]
Features are Categorical
Encoding Features [*]
Encoding Features Done [ ✓ ]
Labels are not categorical [ ✓ ]
Checking for Categorical Variables Done [ ✓ ]
Checking for Sparse Matrix [*]
Splitting Data into Train and Validation Sets [*]
Splitting Done [ ✓ ]
Scaling Training and Test Sets [*]
Scaling Done [ ✓ ]
Training Gradient Boosting Classifier on Training Set [*]
Model Training Done [ ✓ ]
Predicting Data [*]
Data Prediction Done [ ✓ ]
Making Confusion Matrix [*]
[[941 95]
[180 193]]
Confusion Matrix Done [ ✓ ]
Evaluating Model Performance [*]
Validation Accuracy is : 0.8048261178140526
Evaluating Model Performance [ ✓ ]
Applying K-Fold Cross Validation [*]
Accuracy: 79.82 %
Standard Deviation: 1.61 %
K-Fold Cross Validation [ ✓ ]
Complete [ ✓ ]
Time Elapsed : 16.009963989257812 seconds
classifier = Classification(predictor = 'bag')
classifier.fit(X, y)
result = classifier.result()
accuracy_scores[result['Classifier']] = result['Accuracy']
Started LuciferML
Checking if labels or features are categorical! [*]
Features are Categorical
Encoding Features [*]
Encoding Features Done [ ✓ ]
Labels are not categorical [ ✓ ]
Checking for Categorical Variables Done [ ✓ ]
Checking for Sparse Matrix [*]
Splitting Data into Train and Validation Sets [*]
Splitting Done [ ✓ ]
Scaling Training and Test Sets [*]
Scaling Done [ ✓ ]
Training Bagging Classifier on Training Set [*]
Model Training Done [ ✓ ]
Predicting Data [*]
Data Prediction Done [ ✓ ]
Making Confusion Matrix [*]
[[935 101]
[210 163]]
Confusion Matrix Done [ ✓ ]
Evaluating Model Performance [*]
Validation Accuracy is : 0.7792760823278921
Evaluating Model Performance [ ✓ ]
Applying K-Fold Cross Validation [*]
Accuracy: 77.09 %
Standard Deviation: 1.37 %
K-Fold Cross Validation [ ✓ ]
Complete [ ✓ ]
Time Elapsed : 2.5434954166412354 seconds
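Accuracy alone can hide how well each model finds churners. The sketch below recomputes accuracy, precision, and recall from the confusion matrices printed in the logs above. It assumes rows are true classes and columns are predictions (scikit-learn's convention) and that the second class is "Yes"; the row sums of 1036 and 373 are consistent with that reading, since 373 of 1409 matches the overall churn share.

```python
# Confusion matrices copied from the logs above.
# Assumption: rows = true class, columns = predicted class, second class = churn ("Yes").
confusion = {
    "Logistic Regression": [[935, 101], [159, 214]],
    "Support Vector Machine": [[946, 90], [177, 196]],
    "K-Nearest Neighbours": [[889, 147], [187, 186]],
    "Decision Tree": [[846, 190], [203, 170]],
    "Naive Bayes": [[660, 376], [47, 326]],
    "Random Forest": [[935, 101], [194, 179]],
    "Gradient Boosting": [[941, 95], [180, 193]],
    "Bagging": [[935, 101], [210, 163]],
}


def summarize(matrix):
    """Return accuracy, precision, and recall for the churn class."""
    (tn, fp), (fn, tp) = matrix
    total = tn + fp + fn + tp
    return {
        "accuracy": (tn + tp) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }


metrics = {name: summarize(matrix) for name, matrix in confusion.items()}

for name, m in metrics.items():
    print(
        f"{name:24s} acc={m['accuracy']:.3f} "
        f"prec={m['precision']:.3f} rec={m['recall']:.3f}"
    )

best_recall = max(metrics, key=lambda name: metrics[name]["recall"])
print("Highest churn recall:", best_recall)
```

Under these assumptions Naive Bayes has the lowest accuracy but catches the most churners, so the preferred model depends on whether missed churners or false alarms cost the business more.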
plt.figure(figsize=(15, 6))
model_accuracies = list(accuracy_scores.values())
model_names = list(accuracy_scores.keys())
sns.barplot(x=model_accuracies, y=model_names, palette='mako')
<AxesSubplot:>
The model with the highest K-Fold cross-validation accuracy is Logistic Regression, at 79.84 %.
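The ranking can be checked directly from the K-fold figures printed in the logs above. The sketch below also lists the models whose mean accuracy lies within one standard deviation of the best score; that one-standard-deviation threshold is an illustrative choice, not a formal significance test.

```python
# K-fold mean accuracy (%) and standard deviation (%) copied from the logs above.
kfold = {
    "Logistic Regression": (79.84, 1.15),
    "Support Vector Machine": (79.43, 0.53),
    "K-Nearest Neighbours": (76.06, 1.01),
    "Decision Tree": (73.20, 2.12),
    "Naive Bayes": (68.73, 1.98),
    "Random Forest": (78.01, 1.28),
    "Gradient Boosting": (79.82, 1.61),
    "Bagging": (77.09, 1.37),
}

# Sort models from highest to lowest mean accuracy.
ranked = sorted(kfold.items(), key=lambda item: item[1][0], reverse=True)
best_name, (best_acc, best_std) = ranked[0]

# Models whose mean is within one standard deviation of the best model's mean.
close = [name for name, (acc, _) in ranked[1:] if best_acc - acc <= best_std]

print(f"Best: {best_name} at {best_acc:.2f} % (std {best_std:.2f} %)")
print("Within one standard deviation of the best:", close)
```

Gradient Boosting and the Support Vector Machine both fall inside that band, so the lead of Logistic Regression is small relative to the fold-to-fold variation.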
Conclusion
The company can now use the data to anticipate customers' likes and dislikes. The bivariate and multivariate analysis offers further insight that can help the company make decisions wisely and plan its marketing value chain, which in turn should help executives run effective customer retention programmes.
To get more insight into churn for this data set, parameters such as internet speed and international calling should be included. I enjoyed working on this data set and found it effective.